WHY IS BILINGUAL EDUCATION RESEARCH SO BAD?
A CRITIQUE OF THE WALSH AND CARBALLO STUDY OF
MASSACHUSETTS BILINGUAL EDUCATION PROGRAMS



Introduction 

If there is one aspect of the research on bilingual education that all
commentators agree on, it is that its quality is deplorable. One major
problem is that most of it consists of local evaluations with inadequate
research designs and analyses. These local evaluators are usually unable or
unwilling to assemble a comparison group of students who have not had
bilingual education, and typically assess only gains for the students in
bilingual education before and after their participation in the program. To
their credit, many evaluators forced to use this model understand they can
draw no policy conclusions from it. Unfortunately, many do not, and
numerous reviewers have compounded the error by uncritically citing these
and other flawed studies as support for transitional bilingual education as
the best policy alternative for producing the greatest English language
achievement in children of limited English language proficiency (LEP).

The second characteristic of the research is that, as is common with
controversial social programs with egalitarian goals, the evaluators and those
who review and integrate the research are also passionate advocates of
bilingual education for political or ideological reasons. The disgraceful
treatment of linguistic minorities in this country (the mislabeling of
limited English proficiency children as mentally retarded, their high dropout
or pushout rate because they have been allowed to flounder in an alien,
hostile environment or actually punished for using their mother tongue)
may have influenced many social scientists, bilingual education lawyers, and
reviewers of the research to believe that any policy which ignores the
mother tongue in favor of English is racist, and any policy which maintains
the mother tongue, however inadequately, is equitable. This has created an
atmosphere in which it is difficult for an academic to criticize current policy
in this field. It has also created an atmosphere in which it is all too easy
to interpret flawed studies as support for bilingual education and to reject
or ignore competent, relevant studies with conflicting findings.

A recent evaluation of transitional bilingual education programs in five
Massachusetts communities by Catherine E. Walsh and Eduardo B. Carballo,
Transitional Bilingual Education in Massachusetts: A Preliminary Study of Its
Effectiveness (April 1986), follows in this unfortunate tradition. The authors
misrepresent the previous research in the field, perform an evaluation so
inadequate it can tell us nothing about the effectiveness of any of the
bilingual education programs they studied, and then, rather than apologizing,
proclaim their study as evidence of the success of transitional bilingual
education. Specifically, the Walsh and Carballo evaluation has the following
problems:

1) The sample of five school districts suffers from "self-selection 
bias." Only school districts willing to be evaluated, presumably 
those with the most effective programs, were included in the 
study. 

2) Probably as a result of their selection criterion, the sample does not 
include a single large, urban school district. Most glaring is the 
omission of Boston and Springfield. 

3) The student samples analyzed are much too small to draw any
conclusions from. Most problematic is the fact that the control
or comparison group for each of the five school districts is either
inadequate or just plain nonexistent.

4) There is no statistical analysis of the data nor control for
pre-existing differences between groups.

5) Walsh and Carballo conduct the wrong comparison. Students in
transitional bilingual education are compared to students who have
received no help at all rather than students in alternative programs.

What is Bilingual Education?

There are five models of how to instruct children who do not speak
English. The first is the old "sink-or-swim" method which was made illegal
as educational policy by Lau v. Nichols, 1974. Here students are simply
placed in the regular classroom with no special help regardless of their
English language ability. No responsible educator or social scientist advocates
such a policy since there are more humane and effective models.

A second instructional technique is English as a Second Language
(ESL) instruction for one or two periods a day, and "submersion" in the
regular classroom for the rest of the day. ESL is a pullout program usually
based on a special curriculum, but the instructors do not have to know the
child's native language.

A third policy alternative is structured immersion where instruction
is in the language being learned (L2), but the teacher knows the students'
native tongue (L1). The L2 (i.e., English) used in these programs is always
geared to the children's language proficiency at each step so that it is
comprehensible. L1 is used only in the rare instances when the student
cannot complete a task without it. The student thus learns L2 and subject
matter content simultaneously. Immersion programs in which L2 is not the
dominant language of the country typically include at least 30-60 minutes a
day of L1 language arts. In fact, most of the Canadian "immersion" programs,
where English-speaking students learn French, eventually become bilingual
programs.

A fourth, and the most widely implemented, policy alternative is
transitional bilingual education (TBE). According to Young, et al. (1984) at
least 40 percent of all limited English proficient (LEP) children are now in
TBE programs, and only 26 percent are in English instruction classrooms.
The other 34 percent are divided among bilingual maintenance, Spanish
instruction, and ESL classes. By contrast, Okata, et al. (1983) found no
projects which reported "English only" as a literacy goal for LEP students.
Hence, TBE is clearly the dominant special language instructional program in
the U.S.

In transitional bilingual education, the student is taught both in the
native tongue (L1) and the language being learned (L2), with subject matter
taught in L1. The amount of instructional time in L1 is reduced, and L2
increased, until the student is proficient enough in L2 to join the regular
instructional program. The majority of elementary school programs are three
year programs. The rationale underlying TBE differs depending on the age
of the child. For very young children, it is that learning to read in the
native tongue first is a necessary condition for optimal reading ability in the
second tongue. For all children, it is argued that learning a second
language takes time and children should not lose ground in other subject
matters, particularly math, during the time period they are learning the
second language.

A final model is bilingual maintenance. Rather than being transitioned
out of bilingual education, students remain in the program for their entire
school career. The goals of this model are social and intellectual rather
than remedial and therapeutic as with the other models. Almost all such
programs in the United States are bilingual education magnet programs which
are racially and ethnically integrated in order to desegregate a school system.

The Walsh and Carballo study is an evaluation of transitional bilingual
education programs in five Massachusetts communities: Attleboro, Cambridge,
Framingham, Haverhill and Holyoke. Their study compared students in
transitional bilingual education (TBE) to 1) students who have graduated
from TBE programs and have been "mainstreamed" into a regular classroom
and 2) a control group of students who have received no, or "minimal,"
services, i.e., submersion.

Previous Research

Walsh and Carballo begin their evaluation of Massachusetts programs by
reviewing some of the research in the field. They criticize national studies
which have shown transitional bilingual education to be ineffective and
praise studies that purport to demonstrate the effectiveness of transitional
bilingual education. They do this with no regard for the methodological
standards of social science research.

What is a methodologically sound study? In order to determine whether
a bilingual education program is successful, the research study must have a
treatment group subjected to the program and a control or comparison
group, similar to the treatment group, which has not received that program.
If students have not been randomly assigned to the control group, there must
be statistical control for differences between the groups which existed prior
to the time one group received bilingual education. Post-bilingual education
differences between groups alleged to have been caused by the bilingual
education program must be tested by means of an appropriate statistical
analysis to determine if the differences are greater than could have been
expected by chance controlling for pre-existing differences.

A comparison group is essential because all children tend to show
progress in English language knowledge over time. Children of limited
English proficiency will know more English the longer they are in any
program regardless of its effectiveness. Unless we have a comparison group
not receiving that program, we will not know if that increased achievement
is more or less than we would expect to occur naturally. For example, if a
child enters a transitional bilingual education program with an English
language score of 20 and comes out with a score of 60, that might actually
be a negative program effect if similar children in other programs, or in no
program at all, are scoring 20 when they enter school and 80 after the same
time period. Hence, a comparison group is absolutely essential to program
evaluation. Nevertheless, it is missing from most evaluations of bilingual
education because of the difficulty of finding similar children of limited
English proficiency who are not in a bilingual education program. Rather
than apologizing for the lack of a control group, and suggesting that as a
result no conclusions can be drawn, all too many evaluations conclude that
because children know more English after participation in the TBE program
than before, the program is a success.
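
The arithmetic of this example can be written out as a short sketch. The
function name program_effect is hypothetical, and the scores are the
illustrative ones used in the paragraph above (20 to 60 in the program, 20 to
80 in the comparison group), not data from any study.

```python
def program_effect(program_pre, program_post, comparison_pre, comparison_post):
    """Gain of the program group minus gain of the comparison group."""
    program_gain = program_post - program_pre
    comparison_gain = comparison_post - comparison_pre
    return program_gain - comparison_gain


# Illustrative scores from the text, not real data.
gain_only = 60 - 20                      # what a before/after design reports
effect = program_effect(20, 60, 20, 80)  # what a comparison group reveals

print(gain_only)  # 40
print(effect)     # -20
```

A before/after design sees only the gain of 40 points. With the comparison
group in view, the same program appears to have cost its students 20 points.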

Even if there is a comparison group, pre-existing differences between
groups must be statistically controlled for. This is because children with
higher achievement prior to bilingual education will tend to have higher
achievement after bilingual education even if the bilingual education lowered
their English language achievement. In addition, children of higher
socioeconomic status will tend to have higher achievement after bilingual
education than children of lower socioeconomic status not in the program
even if the TBE program lowered the achievement of the higher socioeconomic
students.
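
One common way to control for a pre-existing difference is a covariance
adjustment: the post-test difference between groups is corrected by the
pre-test difference, scaled by the pooled within-group slope. The sketch
below assumes that technique; the function names and all scores are
hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from statistics import mean


def pooled_slope(groups):
    """Pooled within-group slope of post-test on pre-test."""
    sxy = sxx = 0.0
    for pre, post in groups:
        pre_bar, post_bar = mean(pre), mean(post)
        sxy += sum((x - pre_bar) * (y - post_bar) for x, y in zip(pre, post))
        sxx += sum((x - pre_bar) ** 2 for x in pre)
    return sxy / sxx


def adjusted_difference(treat, control):
    """Post-test difference (treatment minus control) adjusted for pre-test."""
    b = pooled_slope([treat, control])
    raw = mean(treat[1]) - mean(control[1])
    return raw - b * (mean(treat[0]) - mean(control[0]))


# Hypothetical scores: every child gains 30 points, except that the
# program costs its students 5 points. The program group starts higher.
control = ([10, 20, 30], [40, 50, 60])
treat = ([30, 40, 50], [55, 65, 75])

raw_difference = mean(treat[1]) - mean(control[1])
print(raw_difference)                       # 15: program looks beneficial
print(adjusted_difference(treat, control))  # -5.0: program is harmful
```

The unadjusted comparison credits the program with the head start its
students brought with them. Once the pre-test difference is removed, the
sign of the estimated effect reverses.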

Not one of the studies cited by Walsh and Carballo as evidence of the
effectiveness of transitional bilingual education followed these absolutely
essential rules for determining program effect, nor did the reviews they cited
select only studies which did that.[1] Ironically, the two studies dismissed by
Walsh and Carballo as "methodologically at fault" did follow these essential
rules for determining program effect. The AIR study (Danoff, et al., 1977;
1978) not only had a control group, but controlled for pre-existing differences
between the students in bilingual education and those in the control group.
The Baker and de Kanter review (1981) selected only studies that had these
characteristics. Both the AIR and Baker and de Kanter studies also had
larger samples than any other study or review to date. They both found
transitional bilingual education to be ineffective in comparison to other
programs. Hence, one suspects that what Walsh and Carballo really object
to are the findings of these two studies, not their methodology.

Of the methodologically sound studies I reviewed (Rossell and Ross,
1986), 71 percent found transitional bilingual education to be no different or
worse than doing nothing in second language learning and 93 percent showed

[1] For a detailed critique of the studies they cited as evidence of the
success of bilingual education, see Christine H. Rossell and J. Michael Ross,
"The Social Science Evidence on Bilingual Education," Center for Applied
Social Science Working Paper 85-7, Boston University, or the same article in
The Journal of Law and Education, Fall 1986; and Keith A. Baker and
Adriana de Kanter, Effectiveness of Bilingual Education: A Review of the
Literature, 1981. Both reviews clearly delineate the standards for
methodologically acceptable studies and the shortcomings of most studies in
the field.
it to be no different or worse than doing nothing in math learning. All but
one comparison of transitional bilingual education to structured immersion
showed the latter to be superior in both second language and math learning.

Thus, Walsh and Carballo begin their evaluation with an inaccurate and
misleading review of the research in this field.

Unfortunately, their conclusions regarding the consensus in this field are all
too accurate. The field of bilingual education is pervaded by a disregard for
the canons of scientific research.

School District Sample

There are two problems with the sample of five school districts
analyzed in the Walsh and Carballo study. First, the sample suffers from
"self-selection bias." The only school districts that were studied were those
that agreed to be studied.[2] Selecting only school districts willing to be
studied is unacceptable by social science research standards because the
school districts which refuse to participate are likely to be those with
unsuccessful programs.

Second, there is not a single large, urban school district in the
sample. This is probably a result of Walsh and Carballo's selection criterion -
that is, only school districts willing to participate were studied. The lack
of a large, urban school district is important because it is easier to implement
any program on a small scale even if the program itself is not a particularly
[2] It is not clear why this limitation was placed on the study since one
of the co-authors, Eduardo Carballo, is an administrator in the Massachusetts
Department of Education and presumably could have insisted on all school
districts participating.
good one. Thus, the generalizability of any findings from this study will be
limited. The findings cannot be applied to the two cities, Boston and
Springfield, with the largest number of limited English proficient students in
the state.

Student Sample and Control Group

The sample size for two of the three groups of students Walsh and
Carballo studied is so inadequate as to disqualify the study on these grounds
alone. To reiterate, the three groups of students studied are: 1) students in
transitional bilingual education (TBE), 2) students formerly in bilingual
education but now "mainstreamed" into regular classrooms, and 3) a control
group of students identified by administrators as limited English proficient but
whose parents refused to enroll them in bilingual education.

The sample size in Attleboro is 16 TBE students, 11 mainstreamed
students, and 0 control group students. The sample size in Cambridge is 25
TBE students, 5 mainstreamed students, and 11 control group students. The
sample size in Framingham is 27 TBE students, 18 mainstreamed students, and
3 control group students. The sample size in Haverhill is 18 TBE students, 18
mainstreamed students, and 8 control group students. The sample size in
Holyoke is 43 TBE students, 7 mainstreamed students and 5 control group
students. Since all comparisons are by school district, the mainstream and
control group are completely inadequate before any measures of program
success, with their accompanying missing data, are analyzed.
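
The counts above are easier to judge when laid out side by side. The sketch
below tabulates them; the cutoff MIN_GROUP_SIZE is a hypothetical threshold
picked only for illustration and is not a standard taken from the study.

```python
# Sample sizes reported in the text: (TBE, mainstreamed, control).
samples = {
    "Attleboro": (16, 11, 0),
    "Cambridge": (25, 5, 11),
    "Framingham": (27, 18, 3),
    "Haverhill": (18, 18, 8),
    "Holyoke": (43, 7, 5),
}

MIN_GROUP_SIZE = 30  # hypothetical threshold, chosen only for illustration

for district, (tbe, mainstreamed, control) in samples.items():
    small = [name for name, n in
             (("TBE", tbe), ("mainstreamed", mainstreamed), ("control", control))
             if n < MIN_GROUP_SIZE]
    print(f"{district}: groups under {MIN_GROUP_SIZE} students: {', '.join(small)}")

# Column totals across all five districts: TBE, mainstreamed, control.
totals = [sum(column) for column in zip(*samples.values())]
print(totals)  # [129, 59, 27]
```

Even pooled across all five districts the control group has only 27
students, and the study never pools them: every comparison is made within a
single district.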
Sample Size and Its Effect on Measures of Program Success

Attendance. On the first measure of program success, days of attendance,
we are given no information as to the size of the sample in each category
or cell.[3] Again, this is unacceptable by social science standards. Moreover,
TBE students and mainstreamed students are inexplicably collapsed into one
group so that we cannot tell how many mainstreamed students are in each
category of attendance - 180-160 days, 160-140 days, 140-120 days, and 120
or less. In addition, since there is no statistical control for their measure
of social class (whether a student is receiving free or reduced lunch), we
also have no idea as to what extent the observed attendance rates are
explained by the social class of each group.[4]

Sixty-eight percent of the TBE/mainstream group in Attleboro attends 
school for 180-160 days a year, but since there is no control group in this 
district (and no data on almost 20 percent of the TBE/mainstream sample), 
we have no idea whether this is better or worse than the rest of the 
students in that school district. 



[3] This is true of all tables. The reader never knows how many
students the authors are analyzing in each category. The exception to this is
when there are 0 or only 1 student in a cell. This is indicated by an asterisk.

[4] Walsh and Carballo give us no information on the social class
composition of each of the three groups: TBE, mainstreamed, and control
group. It is thus possible, although we have no way of telling, that the
control group has much lower social class than the TBE or mainstreamed
groups. This is one of the many glaring errors which render this study
unintelligible. Moreover, they only collected social class data on 56 percent
of the sample in Attleboro, 68 percent of the population in Cambridge, 61
percent of the sample in Haverhill, and 75 percent of the sample in Holyoke.
Yet they were able to obtain social class data on 51 percent of the
population in Framingham. Thus, for four out of five school districts, there
is an unacceptably high rate of missing data on this variable, particularly
given the small sample size.
In Cambridge, 70 percent of the TBE/mainstream students attend school
180-160 days a year, but since 80 percent of the control group attend school
180-160 days, this seemingly positive attendance rate is actually a negative
program effect.[5] In Framingham, 65 percent of the TBE/mainstream students
attend school 180-160 days, but since 100 percent of the control group
attends school 180-160 days,[6] this seemingly positive program effect is
actually a negative effect. Only in Haverhill and Holyoke do TBE/mainstream
students have a higher rate of attendance than control group students, but
given the small size of the control group and the expected missing data, we
may again be talking about only a couple of students. Although for all the
reasons mentioned we can draw no conclusions from it, the Walsh and
Carballo data actually show that in 50 percent of the school districts for
which there is a control group, the TBE/mainstream students had higher
attendance and in 50 percent they had lower attendance than the control group.

Walsh and Carballo may have concluded TBE had a positive effect on
attendance from their "totals percentages" for the four school districts with
a control group. These "totals percentages" show 70 percent of the
TBE/mainstream students attending 180-160 days but only 58 percent of the



[5] This finding demonstrates how important a control group is in
evaluating programs. Without a control group, we would have concluded the
program had a positive effect on children's attendance. With a control
group, we can see the TBE program actually had a negative effect. Moreover,
Cambridge is the only school district on which we are given information (in
the narrative) about the social class of the TBE group and the control
group. The control group is of lower class than the TBE group. Whereas 71
percent of the TBE students are on free or reduced lunch, 86 percent of the
control group are on free or reduced lunch. Thus, despite the fact that
they were of lower social class, the control group had higher attendance.

[6] Of course, the Framingham control group of 3 students (or fewer) is
too small to draw any conclusions from, but Walsh and Carballo do not seem
to know this.
control group students attending 180-160 days a year. Unfortunately, their
"total percent" does not appear to be a total, but an average of the
percentages for each school district. The problem with an average is that it
weights each school district equally rather than by its sample size. Given
my suspicion that Cambridge is not the only school district where the
control group is of lower social class than the TBE/mainstream students, a
statistical analysis controlling for the social class differences between the
two groups may find the control group to be superior in school attendance.
Grades. Grades are an unreliable source of program success when
comparing TBE students to control group students. Students in a class are
always graded in comparison to other students in their class, not to students
in other programs. Even if none of the students in a class know very much 
English, some of them will still receive high grades. Thus the average grade 
of B for students in TBE tells us nothing about how much English they know 
or are learning. 

The grades of mainstream students are another matter. They are
competing against English-speaking students in the regular classroom.
Although there will still be some tendency on the part of teachers toward
grade inflation, it should be small unless a school or school district practices
academic tracking. Thus, if the sample size of the mainstream and control
group were adequate, we might actually have a real possibility of assessing
program success for these groups with this variable. Unfortunately, both the
mainstream and control group student sample ranges from 0 to 11 before
analysis of their grades is conducted and missing data is taken into account.
Of the 20 cells in the mainstreamed and control groups (Table 6, page 44),
30 percent are empty or have only one student. The rest could have only
two students in each cell. It is impossible to do any valid comparisons with
such minuscule sample sizes.

English Language Achievement. Clearly, the most important measure of
program success is English language achievement. The purpose of special
language programs for limited English proficient students is to teach them
English. It is of very little importance if they attend school regularly and
have high grades in comparison to other limited English proficient students
if they do not know English. Walsh and Carballo, however, do not consider
this an important enough variable to present in a table as with the other
measures of program success. One has to wade through the narrative on
each school district to determine the effect of transitional bilingual education
on English language achievement.

Unfortunately, although this is the most important variable, the least
amount of data was collected on it. Of the five school districts with a
control group in this sample, three (Attleboro, Cambridge, and Holyoke) have
no achievement data whatsoever for the control group students. A fourth
school district (Framingham) has one control group achievement score, but
none for the TBE students. The fifth school district, Haverhill, has 6 control
group student achievement scores but only one TBE student achievement
score. There are, however, achievement scores for 12 mainstreamed students.
They are doing very poorly. In short, there is not enough data on achievement
to conduct even the crudest of comparisons, let alone the correct one which
would statistically compare the groups and control for pre-existing differences.
As Walsh and Carballo admit, elementary TBE students average one year
below grade level equivalencies and secondary TBE students two years below
grade level equivalencies. What they do not acknowledge is that without a
control group we have no idea whether that is a positive program effect, a
negative one, or no effect at all.

Statistical Analysis

As stated several times in this paper, social science research has rules
for determining whether one has proven one's hypothesis. Walsh and
Carballo should have statistically compared TBE students to control group
students, or students in alternative programs, and mainstreamed students to
control group students, or students in alternative programs, to see if they
differed more than would have been expected by chance given their sample
size and variance within each group controlling for the social class of the
students and their pre-program English language achievement. Walsh and
Carballo conducted not a single statistical comparison because, given their
minuscule sample size, they could not. The only question remaining is why
present any data at all if it is insufficient to be analyzed by valid social
science methods to determine the effectiveness of the program?
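
As a rough illustration of what "differed more than would have been expected
by chance given their sample size and variance" involves, the sketch below
computes Welch's t statistic for two groups. This is one possible test, not
necessarily the one an evaluator would choose, and it covers only the chance
comparison; it does not control for social class or pre-program achievement.
The function name welch_t and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical test scores, for illustration only.
program = [1, 2, 3, 4, 5]
comparison = [2, 3, 4, 5, 6]

t, df = welch_t(program, comparison)
print(t, df)  # -1.0 8.0
```

The sample variance is undefined for a group of one student and there is
nothing to compute for a group of none, which is why cells of that size
cannot enter any comparison of this kind.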

The Wrong Comparison 

Even had they had an adequate sample and correctly analyzed the
differences between groups, the Walsh and Carballo study could still have
been criticized for comparing TBE and mainstreamed students to the wrong
group. The control group in this study consists of students who received
no, or "minimal," services.[7] I know of no educator or social scientist who

[7] Walsh and Carballo state that "although control group students are
enrolled in the monolingual curriculum, those identified as LEP (limited
English proficient) are monitored by the TBE program and, depending on
school location, are offered minimal ESL support" (fn. 4). One cannot help
but have the suspicion that administrators were able to identify the control
would advocate no, or minimal, services for LEP children. Children of
limited English proficiency need special help and should not be left to sink or
swim in the regular classroom when there are more effective and humane
alternatives.

The only policy proposals that I have seen recommend alternative forms
of special language help for limited English proficient children. One very
successful alternative program is structured immersion, described above.
Moreover, there are some programs similar to structured immersion in
Newton and Brookline.[8] The students in these programs could be compared
to those in transitional bilingual education controlling for social class and
pre-existing English language ability. In addition, transitional bilingual
education programs vary in the extent of English used in the classroom.
Programs which use more English could be compared to those which use less
English. Many students are placed in ESL programs because the school



group students because they were having academic difficulty and thus were
being monitored. It is hard to believe, particularly given the high rate of
missing data in this study, that administrators are able to keep track of
every single student who entered the school system, was identified as LEP,
but whose parents did not want him or her enrolled in transitional bilingual
education or for whom there was no program. If they were able to do that,
the control group in this sample would have been much larger and included
numerous Asian linguistic minorities for whom it is impossible to conduct a
TBE program because there are no certified teachers who speak their
language. Given the unlikeliness of identifying every single formerly LEP
student, administrators would naturally tend to identify as LEP students
those who are currently LEP, that is, having academic difficulties. Thus,
there might also be a student selection bias here which needs to be controlled
for, although unfortunately no statistical technique will completely do that.

[8] Despite the fact that the research shows structured immersion to be
superior to transitional bilingual education and transitional bilingual education
to be no better than doing nothing, the administrators of these programs
find themselves to be in violation of state law. Even more administrators
would be willing to adopt similar alternative models, but lack the courage or
the funds to fight state agencies.
district cannot find a certified teacher who speaks their language or there
are not enough students with that language to justify allocating a classroom
and teacher. Cambridge, for example, has some students in ESL programs
with academic content area instruction conducted in regular classrooms.
These students could be compared to TBE students controlling for
pre-program differences between the groups.

None of these data have been collected by Walsh and Carballo or any
other evaluators. However, it is exactly this kind of information on
variations in TBE and alternative programs which is needed to assemble a
respectable "control" or comparison group. The only mystery is why it
hasn't been done.

Conclusions

The Walsh and Carballo study of transitional bilingual education
programs in five Massachusetts communities does not show that "TBE
students and mainstream students are much more successful in school than
are LEP students who have never been served by TBE programs (control
students)" as the authors claim (p.73). Even if we are to take the data at
face value and not demand that it conform to the standards of social science
research, it does not show that. Days of attendance is not a measure of
success in school, but even if it were, half the cases show TBE to be
superior and half show doing nothing to be better. Grades are a better
measure of success in school, but there is virtually no data for the
mainstreamed and control group students and so one can come to no
conclusion regarding success in school when limited English proficient
students are compared only to each other.
The best measure of what transitional bilingual education is supposed to
be accomplishing is a measure of English language achievement, but there is
also virtually no comparative data on that.[9] Either the TBE group or the
control group are missing achievement data in all but one of the school
districts. In the one school district where there is achievement data for the
control group and the mainstreamed students, the sample size is so small (6)
that an analysis of covariance could not be conducted even if Walsh and
Carballo wished to do so.

It would be nice if I could say this is one of the worst evaluations of 
bilingual education I have ever read, but that is not the case. The research 
in this field is so bad that this study probably ranks in the top half of all 
evaluations in terms of quality. Most local evaluations do not even attempt 
to assemble a control group nor examine the progress of mainstreamed 
students as Walsh and Carballo did. In a sense then, they are more clever 
than most evaluators since they appear to have comparison groups, but
ultimately do not. 

An important question usually ignored in discussions about the poor
quality of bilingual education research (Zappert and Cruz, 1977; Okata, 1983;
Willig, 1981-82), is why is it so bad and why is this tolerated? I believe the
research is poor because bilingual education is the ideologically "correct" policy
alternative. To be in favor of bilingual education, regardless of its effect
on children, is the "civil rights" position. To be in favor of alternatives to



[9] No explanation is given as to why Walsh and Carballo had such a
hard time collecting data on their outcome measures. School districts
routinely test all students yearly and new students at the time of admission.
They also keep track of the attendance of all students and their grades.
Yet this routine bookkeeping information is missing for more than half of their
already minuscule sample.
bilingual education is to be reactionary and racist. The reason this has
occurred is because no other policy alternative allows use of the native
tongue in instruction and also requires the use of native tongue speakers as
instructors. Thus, in the minds of many civil rights advocates, this feature
of bilingual education is so important as to make its effect on English
language achievement secondary in importance. Nevertheless, it is obviously
politically useful to show a positive English language achievement effect.
Since a poor evaluation (that is, one with no comparison group and/or
statistical analysis) will guarantee a "positive" English language achievement
effect, all but a handful of bilingual education evaluations are of poor quality.

The elites, academics, and policymakers who tolerate such poor research
generally fall into two groups: those who have been intimidated by the
bilingual education "establishment" into supporting TBE and those who simply
do not care enough about the education of immigrant children to determine
the truth for themselves. Perhaps one of the saddest aspects of the bilingual
education literature and research is that decent and honorable people who
were once reformers have become the conservative "establishment." They
have forgotten that the purpose of bilingual education is to help children.
It is not an end in and of itself.